Abstract. In this paper we establish links between, and new results for, three problems that 
are not usually considered together. The first is a matrix decomposition problem that arises in 
areas such as statistical modeling and signal processing: given a matrix X formed as the sum of an 
unknown diagonal matrix and an unknown low rank positive semidefinite matrix, decompose X into 
these constituents. The second problem we consider is to determine the facial structure of the set of 
correlation matrices, a convex set also known as the elliptope. This convex body, and particularly 
its facial structure, plays a role in applications from combinatorial optimization to mathematical 
finance. The third problem is a basic geometric question: given points $v_1, v_2, \ldots, v_n \in \mathbb{R}^k$ (where 
$n > k$), determine whether there is a centered ellipsoid passing exactly through all of the points. 

We show that in a precise sense these three problems are equivalent. Furthermore we establish 
a simple sufficient condition on a subspace U that ensures any positive semidefinite matrix L with 
column space U can be recovered from D + L for any diagonal matrix D using a convex optimization- 
based heuristic known as minimum trace factor analysis. This result leads to a new understanding 
of the structure of rank-deficient correlation matrices and a simple condition on a set of points that 
ensures there is a centered ellipsoid passing through them. 

1. Introduction. Decomposing a matrix as a sum of matrices with simple structure 
is a fundamental operation with numerous applications. A matrix decomposition 
may provide computational benefits, such as allowing the efficient solution of the 
associated linear system in the square case. Furthermore, if the matrix arises from 
measurements of a physical process (such as a sample covariance matrix), decomposing 
that matrix can provide valuable insight about the structure of the physical 
process. 

Among the most basic and well-studied additive matrix decompositions is the 
decomposition of a matrix as the sum of a diagonal matrix and a low-rank matrix. 
This decomposition problem arises in the factor analysis model in statistics, which 
has been studied extensively since Spearman's original work of 1904 [25]. The same 
decomposition problem is known as the Frisch scheme in the system identification 
literature [17]. For concreteness, in Section 1.1 we briefly discuss a stylized version of 
a problem in signal processing that under various assumptions can be modeled as a 
(block) diagonal and low-rank decomposition problem. 

Much of the literature on diagonal and low-rank matrix decompositions is in one 
of two veins. An early approach [?] that has seen recent renewed interest [11] is an 
algebraic one, where the principal aim is to give a characterization of the vanishing 
ideal of the set of symmetric $n \times n$ matrices that decompose as the sum of a diagonal 
matrix and a rank $k$ matrix. Such a characterization has only been obtained for the 
border cases $k = 1$, $k = n - 1$ (due to Kalman [17]), and the recently resolved $k = 2$ 
case (due to Brouwer and Draisma [3], following a conjecture by Drton et al. [11]). 
This approach does not (yet) offer scalable algorithms for performing decompositions, 
rendering it unsuitable for many applications including those in high-dimensional 
statistics, optics [12], and signal processing [24]. The other main approach to factor 
analysis is via heuristic local optimization techniques, often based on the expectation 
maximization (EM) algorithm [9]. This approach, while computationally tractable, 
typically offers no provable performance guarantees. 

A third way is offered by convex optimization-based methods for diagonal and 
low-rank decompositions such as minimum trace factor analysis (MTFA), the idea 
and initial analysis of which date at least to Ledermann's 1940 work [21]. MTFA is 
computationally tractable, being based on a semidefinite program (see Section 2), and 
yet offers the possibility of provable performance guarantees. In this paper we provide 
a new analysis of MTFA that is particularly suitable for high-dimensional problems. 

Semidefinite programming duality theory provides a link between this matrix 
decomposition heuristic and the facial structure of the set of correlation matrices 
(positive semidefinite matrices with unit diagonal), also known as the elliptope [19]. 
This set is one of the simplest of spectrahedra, that is, affine sections of the positive 
semidefinite cone. Spectrahedra are of particular interest for two reasons. First, spectrahedra 
are a rich class of convex sets that have many nice properties (such as being facially 
exposed). Second, there are well-developed algorithms, efficient both in theory and 
in practice, for optimizing linear functionals over spectrahedra. These optimization 
problems are known as semidefinite programs [30]. 

The elliptope arises in semidefinite programming-based relaxations of problems 
in areas such as combinatorial optimization (e.g. the MAX-CUT problem [2]) and 
statistical mechanics (e.g. the $k$-vector spin glass problem [?]). In addition, the problem 
of projecting onto the set of (possibly low-rank) correlation matrices has enjoyed 
considerable interest in mathematical finance and numerical analysis in recent years [16]. 
In each of these applications the structure of the set of low-rank correlation matrices, 
i.e. the facial structure of this convex body, plays an important role. 

Understanding the faces of the elliptope turns out to be related to the following 
ellipsoid fitting problem: given $n$ points in $\mathbb{R}^k$ (with $n > k$), under what conditions on 
the points is there an ellipsoid centered at the origin that passes exactly through these 
points? While there is considerable literature on many ellipsoid-related problems, we 
are not aware of any previous systematic investigation of this particular problem. 

1.1. Illustrative application: direction of arrival estimation. Direction 
of arrival estimation is a classical problem in signal processing where (block) diagonal 
and low-rank decomposition problems arise naturally. In this section we briefly discuss 
some stylized models of the direction of arrival estimation problem that can be reduced 
to matrix decomposition problems of the type considered in this paper. 

Suppose we have $n$ sensors at locations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n) \in \mathbb{R}^2$ that 
are passively 'listening' for waves (electromagnetic or acoustic) at a known frequency 
from $r \ll n$ sources in the far field (so that the waves are approximately plane waves 
when they reach the sensors). The aim is to estimate the number of sources $r$ and their 
directions of arrival $\theta = (\theta_1, \theta_2, \ldots, \theta_r)$ given sensor measurements and knowledge of 
the sensor locations (see Figure 1.1). 



A standard mathematical model for this problem (see [18] for a derivation) is to 
model the vector of sensor measurements $z(t) \in \mathbb{C}^n$ at time $t$ as 

$$z(t) = A(\theta) s(t) + n(t) \tag{1.1}$$

where $s(t) \in \mathbb{C}^r$ is the vector of baseband signal waveforms from the sources, $n(t) \in \mathbb{C}^n$ 
is the vector of sensor measurement noise, and $A(\theta)$ is the $n \times r$ matrix with complex 
entries $[A(\theta)]_{ij} = e^{-k\sqrt{-1}\,(x_i \cos(\theta_j) + y_i \sin(\theta_j))}$, with $k$ a positive constant related to the 
frequency of the waves being sensed. 

Fig. 1.1: Plane waves from directions $\theta_1$ and $\theta_2$ arriving at an array of sensors equally 
spaced on a circle (a uniform circular array). 

The column space of $A(\theta)$ contains all the information about the directions of 
arrival $\theta$. As such, subspace-based approaches to direction of arrival estimation aim 
to estimate the column space of $A(\theta)$ (from which a number of standard techniques 
can be employed to estimate $\theta$). 

Typically $s(t)$ and $n(t)$ are modeled as zero-mean stationary white Gaussian processes 
with covariances $E[s(t)s(t)^H] = P$ and $E[n(t)n(t)^H] = Q$ respectively (where 
$A^H$ denotes the Hermitian transpose of $A$ and $E[\cdot]$ the expectation). In the simplest 
setting, $s(t)$ and $n(t)$ are assumed to be uncorrelated so that the covariance of the 
sensor measurements at any time is 

$$\Sigma = A(\theta) P A(\theta)^H + Q.$$

The first term is Hermitian positive semidefinite with rank $r$, i.e. the number of 
sources. Under the assumption that spatially well-separated sensors (such as in a 
sensor network) have uncorrelated measurement noise, $Q$ is diagonal. In this case 
the covariance $\Sigma$ of the sensor measurements decomposes as a sum of a positive 
semidefinite matrix of rank $r \ll n$ and a diagonal matrix. Given an approximation 
of $\Sigma$ (e.g. a sample covariance), approximately performing this diagonal and low-rank 
matrix decomposition allows the estimation of the column space of $A(\theta)$ and in turn 
the directions of arrival. 
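
As a small numerical illustration of this covariance model, the following Python sketch builds $A(\theta)$ for a uniform circular array and forms $\Sigma = A(\theta) P A(\theta)^H + Q$ with diagonal $Q$. The array radius, the constant $k$, the source directions, and the particular (diagonal) choices of $P$ and $Q$ are hypothetical values chosen only for illustration; the function names are likewise not from the paper.

```python
import cmath
import math


def steering_matrix(xs, ys, thetas, k):
    """n x r matrix with entries exp(-k*sqrt(-1)*(x_i cos(theta_j) + y_i sin(theta_j)))."""
    return [
        [cmath.exp(-1j * k * (x * math.cos(t) + y * math.sin(t))) for t in thetas]
        for x, y in zip(xs, ys)
    ]


def sensor_covariance(A, p_diag, q_diag):
    """Sigma = A P A^H + Q, assuming diagonal P (source powers) and diagonal Q (noise)."""
    n, r = len(A), len(p_diag)
    sigma = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sigma[i][j] = sum(A[i][s] * p_diag[s] * A[j][s].conjugate() for s in range(r))
        sigma[i][i] += q_diag[i]  # the noise only touches the diagonal
    return sigma


# Hypothetical setup: 6 sensors on the unit circle, 2 far-field sources.
n = 6
xs = [math.cos(2 * math.pi * i / n) for i in range(n)]
ys = [math.sin(2 * math.pi * i / n) for i in range(n)]
A = steering_matrix(xs, ys, thetas=[0.3, 1.7], k=2.0)
p_diag = [1.0, 0.5]                          # assumed source powers
q_diag = [0.1 * (i + 1) for i in range(n)]   # assumed unequal noise variances
sigma = sensor_covariance(A, p_diag, q_diag)
low_rank = sensor_covariance(A, p_diag, [0.0] * n)
```

In this sketch `sigma` and `low_rank` agree off the diagonal, which is exactly why recovering the low-rank term from $\Sigma$ amounts to filling in the diagonal correctly.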

A variation on this problem occurs if there are multiple sensors at each location, 
sensing, for example, waves at different frequencies. Again under the assumption that 
well-separated sensors have uncorrelated measurement noise, and sensors at the same 
location have correlated measurement noise, the sensor noise covariance matrix Q 
would be block-diagonal. As such the covariance of all of the sensor measurements 
would decompose as the sum of a low-rank matrix (with rank equal to the total 
number of sources over all measured frequencies) and a block-diagonal matrix. 

A block-diagonal and low-rank decomposition problem also arises if the second-order 
statistics of the noise have certain symmetries. This might occur in cases where 
the sensors themselves are arranged in a symmetric way (such as in the uniform 
circular array shown in Figure 1.1). In this case there is a unitary matrix $T$ (depending 
only on the symmetry group of the array) such that $T Q T^H$ is block-diagonal [25]. 
Then the covariance of the sensor measurements, when written in coordinates with 
respect to $T$, is 

$$T \Sigma T^H = T A(\theta) P A(\theta)^H T^H + T Q T^H$$

which has a decomposition as the sum of a block-diagonal matrix and a rank $r$ 
Hermitian positive semidefinite matrix (as conjugation by $T$ does not change the 
rank of this term). 

Note that the matrix decomposition problems discussed in this section involve 
Hermitian matrices with complex entries, rather than the symmetric matrices with 
real entries considered elsewhere in this paper. It is straightforward to generalize the 
main problems and results throughout the paper to the complex setting. 

1.2. Contributions. 

Relating MTFA, correlation matrices, and ellipsoid fitting. We introduce and 
make explicit the links between the analysis of MTFA, the facial structure of the 
elliptope, and the ellipsoid fitting problem, showing that these problems are, in a 
precise sense, equivalent (see Proposition 3.1). As such, we relate a basic problem in 
statistical modeling (tractable diagonal and low-rank matrix decompositions), a basic 
problem in convex algebraic geometry (understanding the facial structure of perhaps 
the simplest of spectrahedra), and a basic geometric problem. 

A sufficient condition for the three problems. The main result of the paper is to 
establish a new, simple, sufficient condition on a subspace $\mathcal{U}$ of $\mathbb{R}^n$ that ensures that 
MTFA correctly decomposes matrices of the form $D^\star + L^\star$ where $\mathcal{U}$ is the column 
space of $L^\star$. The condition is stated in terms of a measure of coherence of a subspace 
(made precise in Definition 4.1). Informally, the coherence of a subspace is a real 
number between zero and one that measures how close the subspace is to containing 
any of the elementary unit vectors. This result can be translated into new results for 
the other two problems under consideration based on the relationship between the 
analysis of MTFA, the faces of the elliptope, and ellipsoid fitting. 

Block-diagonal and low-rank decompositions. In Section 5 we turn our attention 
to the block-diagonal and low-rank decomposition problem, showing how our results 
generalize to that setting. Our arguments combine our results for the diagonal and 
low-rank decomposition case with an understanding of the symmetries of the 
block-diagonal and low-rank decomposition problem. 

1.3. Outline. The remainder of the paper is organized as follows. We describe 
notation, give some background on semidefinite programming, and provide precise 
problem statements in Section 2. In Section 3 we present our first contribution by 
establishing relationships between the success of MTFA, the faces of the elliptope, and 
ellipsoid fitting. We then illustrate these connections by noting the equivalence of a 
known result about the faces of the elliptope, and a known result about MTFA, and 
translating these into the context of ellipsoid fitting. Section 4 is focused on establishing 
and interpreting our main result: a sufficient condition for the three problems 
based on a coherence inequality. Finally in Section 5 we generalize our results to the 
analogous tractable block-diagonal and low-rank decomposition problem. 

2. Background and problem statements. 

2.1. Notation. If $x, y \in \mathbb{R}^n$ we denote by $\langle x, y \rangle = \sum_{i=1}^n x_i y_i$ the standard 
Euclidean inner product and by $\|x\|_2 = \langle x, x \rangle^{1/2}$ the corresponding Euclidean norm. 
We write $x \geq 0$ and $x > 0$ to indicate that $x$ is entry-wise non-negative and strictly 
positive, respectively. Correspondingly if $X, Y \in \mathcal{S}^n$, the set of $n \times n$ symmetric 
matrices, then we denote by $\langle X, Y \rangle = \mathrm{tr}(XY)$ the trace inner product and by 
$\|X\|_F = \langle X, X \rangle^{1/2}$ the Frobenius norm. We write $X \succeq 0$ and $X \succ 0$ to indicate that $X$ is 
positive semidefinite and strictly positive definite, respectively. We write $\mathcal{S}^n_+$ for the 
cone of $n \times n$ positive semidefinite matrices. 

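The column space of a matrix $X$ is denoted $\mathcal{R}(X)$ and the nullspace is denoted 
$\mathcal{N}(X)$. If $X$ is an $n \times n$ matrix then $\mathrm{diag}(X) \in \mathbb{R}^n$ is the diagonal of $X$. If $x \in \mathbb{R}^n$ then 
$\mathrm{diag}^*(x) \in \mathcal{S}^n$ is the diagonal matrix with $[\mathrm{diag}^*(x)]_{ii} = x_i$ for $i = 1, 2, \ldots, n$. If $\mathcal{U}$ is 
a subspace of $\mathbb{R}^n$ then $P_{\mathcal{U}} : \mathbb{R}^n \to \mathbb{R}^n$ denotes the orthogonal projector onto $\mathcal{U}$, that 
is the self-adjoint linear map such that $\mathcal{R}(P_{\mathcal{U}}) = \mathcal{U}$, $P_{\mathcal{U}}^2 = P_{\mathcal{U}}$ and $\mathrm{tr}(P_{\mathcal{U}}) = \dim(\mathcal{U})$. 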

We use the notation $e_i$ for the vector with a one in the $i$th position and zeros 
elsewhere and the notation $\mathbf{1}$ to denote the vector all entries of which are one. We 
use the shorthand $[n]$ for the set $\{1, 2, \ldots, n\}$. The set of $n \times n$ correlation matrices, 
i.e. positive semidefinite matrices with unit diagonal, is denoted $\mathcal{E}_n$. For brevity we 
typically refer to $\mathcal{E}_n$ as the elliptope, and the elements of $\mathcal{E}_n$ as correlation matrices. 

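2.2. Semidefinite programming. The term semidefinite programming [30] 
refers to convex optimization problems of the form 

$$\text{minimize}\ \langle C, X \rangle \quad \text{subject to} \quad \mathcal{A}(X) = b,\ X \succeq 0 \tag{2.1}$$

where $X$ and $C$ are $n \times n$ symmetric matrices, $b \in \mathbb{R}^m$, and $\mathcal{A} : \mathcal{S}^n \to \mathbb{R}^m$ is a linear 
map. The dual semidefinite program is 

$$\text{maximize}\ \langle b, y \rangle \quad \text{subject to} \quad C - \mathcal{A}^*(y) = S,\ S \succeq 0 \tag{2.2}$$

where $\mathcal{A}^* : \mathbb{R}^m \to \mathcal{S}^n$ is the adjoint of $\mathcal{A}$. 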

General semidefinite programs can be solved in polynomial time using interior 
point methods [30]. While our focus in this paper is not on algorithms, we remark 
that for the structured semidefinite programs discussed in this paper, many different 
special-purpose methods have been devised. 

The main result about semidefinite programming that we use is the following 
optimality condition (see [30] for example). 

Theorem 2.1. Suppose (2.1) and (2.2) are strictly feasible. Then $X^\star$ and 
$(y^\star, S^\star)$ are optimal for the primal (2.1) and dual (2.2) respectively if and only if 
$X^\star$ is primal feasible, $(y^\star, S^\star)$ is dual feasible and $X^\star S^\star = 0$. 

2.3. Tractable diagonal and low-rank matrix decompositions. To decompose 
$X$ into a diagonal part and a positive semidefinite low-rank part, we may try to 
solve the following rank minimization problem 

$$\text{minimize}\ \mathrm{rank}(L) \quad \text{subject to} \quad X = D + L,\ L \succeq 0,\ D \text{ diagonal}.$$

Since the rank function is non-convex and non-differentiable, it is not clear how to 
solve this optimization problem directly. One approach that has been successful for 
other rank minimization problems (for example those in [22, 23]) is to replace the 
rank function with the trace function in the objective. This can be viewed as a 
convexification of the problem as the trace function is the convex envelope of the rank 
function when restricted to positive semidefinite matrices with spectral norm at most 
one. Performing this convexification leads to the semidefinite program we refer to as 
minimum trace factor analysis (MTFA): 

$$\text{minimize}\ \mathrm{tr}(L) \quad \text{subject to} \quad X = D + L,\ L \succeq 0,\ D \text{ diagonal}. \tag{2.3}$$

It has been shown by Della Riccia and Shapiro [7] that if MTFA is feasible it has a 
unique optimal solution. One central concern of this paper is to understand when the 
diagonal and low-rank decomposition of a matrix given by MTFA is 'correct' in the 
following sense. 

Recovery problem I. Suppose $X$ is a matrix of the form $X = D^\star + L^\star$ where $D^\star$ 
is diagonal and $L^\star$ is positive semidefinite. What conditions on $(D^\star, L^\star)$ ensure that 
$(D^\star, L^\star)$ is the unique optimum of MTFA with input $X$? 

We establish in Section 3 that whether $(D^\star, L^\star)$ is the unique optimum of MTFA 
with input $X = D^\star + L^\star$ depends only on the column space of $L^\star$, motivating the 
following definition. 

Definition 2.2. A subspace $\mathcal{U}$ of $\mathbb{R}^n$ is recoverable by MTFA if for every 
diagonal $D^\star$ and every positive semidefinite $L^\star$ with column space $\mathcal{U}$, $(D^\star, L^\star)$ is the 
unique optimum of MTFA with input $X = D^\star + L^\star$. 

In these terms, we can restate the recovery problem succinctly as follows. 

Recovery problem II. Determine which subspaces of $\mathbb{R}^n$ are recoverable by MTFA. 

Much of the basic analysis of MTFA, including optimality conditions and 
relations between minimum rank and minimum trace factor analysis, was carried out in 
a sequence of papers by Shapiro [?] and Della Riccia and Shapiro [7]. More 
recently, Chandrasekaran et al. [6] and Candès et al. [4] considered convex 
optimization heuristics for decomposing a matrix as a sum of a sparse and low-rank matrix. 
Since a diagonal matrix is certainly sparse, the analysis in [6] can be specialized to 
give fairly conservative sufficient conditions for the success of MTFA. 

The diagonal and low-rank decomposition problem can also be interpreted as a 
low-rank matrix completion problem, where we are given all the entries of a low-rank 
matrix except the diagonal, and aim to correctly reconstruct the diagonal entries. As 
such, this paper is closely related to the ideas and techniques used in the work of 
Candès and Recht [5] and a number of subsequent papers on this topic. We would 
like to emphasize a key point of distinction between that line of work and the present 
paper. The recent low-rank matrix completion literature largely focuses on 
determining the proportion of randomly selected entries of a low-rank matrix that need 
to be revealed to be able to reconstruct that low-rank matrix using a tractable 
algorithm. The results of this paper, on the other hand, can be interpreted as attempting 
to understand which low-rank matrices can be reconstructed from a fixed and quite 
canonical pattern of revealed entries. 

2.4. Faces of the elliptope. The faces of the cone of $n \times n$ positive semidefinite 
matrices are all of the form 

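$$\mathcal{F}_{\mathcal{U}} = \{X \succeq 0 : \mathcal{N}(X) \supseteq \mathcal{U}\} \tag{2.4}$$

where $\mathcal{U}$ is a subspace of $\mathbb{R}^n$ [19]. Conversely given any subspace $\mathcal{U}$ of $\mathbb{R}^n$, $\mathcal{F}_{\mathcal{U}}$ is a 
face of $\mathcal{S}^n_+$. As a consequence, the faces of $\mathcal{E}_n$ are all of the form 

$$\mathcal{E}_n \cap \mathcal{F}_{\mathcal{U}} = \{X \succeq 0 : \mathcal{N}(X) \supseteq \mathcal{U},\ \mathrm{diag}(X) = \mathbf{1}\} \tag{2.5}$$

where $\mathcal{U}$ is a subspace of $\mathbb{R}^n$ [15]. It is not the case, however, that for every subspace 
$\mathcal{U}$ of $\mathbb{R}^n$ there is a correlation matrix with nullspace containing $\mathcal{U}$, motivating the 
following definition. 

Definition 2.3 ([19]). A subspace $\mathcal{U}$ of $\mathbb{R}^n$ is realizable if there is an $n \times n$ 
correlation matrix $Q$ such that $\mathcal{N}(Q) \supseteq \mathcal{U}$. 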

The problem of understanding the facial structure of the set of correlation matrices 
can be restated as follows. 

Facial structure problem. Determine which subspaces of $\mathbb{R}^n$ are realizable. 

Much is already known about the faces of the elliptope. For example, all possible 
dimensions of faces, as well as polyhedral faces, are known [20]. Characterizations of 
the realizable subspaces of $\mathbb{R}^n$ of dimension $1$, $n - 2$, and $n - 1$ are given in [8] and 
implicitly in [19] and [20]. Nevertheless, little is known about which $k$ dimensional 
subspaces of $\mathbb{R}^n$ are realizable for general $n$ and $k$. 

2.5. Ellipsoid fitting. 

Ellipsoid fitting problem I. What conditions on a collection of $n$ points in $\mathbb{R}^k$ 
ensure that there is a centered ellipsoid passing exactly through all those points? 
Let us consider some basic properties of this problem. 

Number of points. If $n \leq k$ we can always fit an ellipsoid to the points. Indeed if 
$V$ is the matrix with columns $v_1, v_2, \ldots, v_n$ then the image of the unit sphere in $\mathbb{R}^n$ 
under $V$ is a centered ellipsoid passing through $v_1, v_2, \ldots, v_n$. If $n > \binom{k+1}{2}$ and the 
points are 'generic' then we cannot fit a centered ellipsoid to them. This is because 
if we represent the ellipsoid by a symmetric $k \times k$ matrix $M$, the condition that it 
passes through the points (ignoring the positivity condition on $M$) means that $M$ 
must satisfy $n$ linearly independent linear equations, while the space of symmetric 
$k \times k$ matrices has dimension only $\binom{k+1}{2}$. 
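
The easy case $n = k$ can be checked directly. The sketch below, for $k = n = 2$ and linearly independent points, uses the fact that the image of the unit circle under $V = [v_1\ v_2]$ is $\{x : x^T (V V^T)^{-1} x = 1\}$. The function names and the two example points are hypothetical, chosen only for illustration.

```python
def ellipsoid_through_two_points(v1, v2):
    """Return M = (V V^T)^{-1} for V = [v1 v2], assuming v1, v2 in R^2 are independent."""
    # G = V V^T is the 2 x 2 matrix v1 v1^T + v2 v2^T.
    g11 = v1[0] * v1[0] + v2[0] * v2[0]
    g12 = v1[0] * v1[1] + v2[0] * v2[1]
    g22 = v1[1] * v1[1] + v2[1] * v2[1]
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-12:
        raise ValueError("points are linearly dependent")
    # Closed-form inverse of a symmetric 2 x 2 matrix.
    return [[g22 / det, -g12 / det], [-g12 / det, g11 / det]]


def quadratic_form(M, v):
    """Evaluate v^T M v for a 2 x 2 matrix M."""
    return sum(v[i] * M[i][j] * v[j] for i in range(2) for j in range(2))


def symmetric_matrix_dimension(k):
    """Dimension of the space of symmetric k x k matrices, binom(k + 1, 2)."""
    return k * (k + 1) // 2


v1, v2 = (2.0, 1.0), (-1.0, 3.0)
M = ellipsoid_through_two_points(v1, v2)
# Both values should be 1 up to rounding: the ellipse passes through v1 and v2.
print(quadratic_form(M, v1), quadratic_form(M, v2))
```

For $k = 3$, `symmetric_matrix_dimension(3)` gives the number of free parameters in $M$, so more than that many generic points in $\mathbb{R}^3$ cannot all lie on one centered ellipsoid.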

Invariances. If $T \in GL(k)$ is an invertible linear map then there is an ellipsoid 
passing through $v_1, v_2, \ldots, v_n$ if and only if there is an ellipsoid passing through 
$Tv_1, Tv_2, \ldots, Tv_n$. This means that whether there is an ellipsoid passing through $n$ 
points in $\mathbb{R}^k$ does not depend on the actual set of $n$ points, but on a subspace of $\mathbb{R}^n$ 
related to the points. We summarize this observation in the following lemma. 

Lemma 2.4. Suppose $V$ is a $k \times n$ matrix with row space $\mathcal{V}$. If there is a centered 
ellipsoid in $\mathbb{R}^k$ passing through the columns of $V$ then there is a centered ellipsoid 
passing through the columns of any matrix $\tilde{V}$ with row space $\mathcal{V}$. 

Lemma 2.4 asserts that whether it is possible to fit an ellipsoid to $v_1, v_2, \ldots, v_n$ 
depends only on the row space of the matrix with columns given by the $v_i$, motivating 
the following definition. 

Definition 2.5. A subspace $\mathcal{V}$ of $\mathbb{R}^n$ has the ellipsoid fitting property if there is 
a $k \times n$ matrix $V$ with row space $\mathcal{V}$ and a centered ellipsoid in $\mathbb{R}^k$ that passes through 
each column of $V$. 

As such we can restate the ellipsoid fitting problem as follows. 

Ellipsoid fitting problem II. Determine which subspaces of $\mathbb{R}^n$ have the ellipsoid 
fitting property. 

3. Relating ellipsoid fitting, diagonal and low-rank decompositions, 
and correlation matrices. In this section we show that the ellipsoid fitting problem, 
the recovery problem, and the facial structure problem are equivalent in the 
following sense. 

Proposition 3.1. Let $\mathcal{U}$ be a subspace of $\mathbb{R}^n$. Then the following are equivalent: 

1. $\mathcal{U}$ is recoverable by MTFA. 

2. $\mathcal{U}$ is realizable. 

3. $\mathcal{U}^\perp$ has the ellipsoid fitting property. 

Proof. To see that 2 implies 3, let $V$ be a $k \times n$ matrix with nullspace $\mathcal{U}$ and 
let $v_i$ denote the $i$th column of $V$. If $\mathcal{U}$ is realizable there is a correlation matrix $Y$ 
with nullspace containing $\mathcal{U}$. Hence there is some $M \succeq 0$ such that $Y = V^T M V$ and 
$v_i^T M v_i = 1$ for $i \in [n]$. Since $V$ has nullspace $\mathcal{U}$, it has row space $\mathcal{U}^\perp$. Hence the 
subspace $\mathcal{U}^\perp$ has the ellipsoid fitting property. By reversing the argument we see that 
the converse also holds. 

The equivalence of 1 and 2 arises from semidefinite programming duality. 
Following a slight reformulation, MTFA (2.3) can be expressed as 

$$\underset{d,\, L}{\text{maximize}}\ \langle \mathbf{1}, d \rangle \quad \text{subject to} \quad X = \mathrm{diag}^*(d) + L,\ L \succeq 0 \tag{3.1}$$

and its dual as 

$$\underset{Y}{\text{minimize}}\ \langle X, Y \rangle \quad \text{subject to} \quad \mathrm{diag}(Y) = \mathbf{1},\ Y \succeq 0 \tag{3.2}$$

which is clearly just the optimization of the linear functional defined by $X$ over the 
elliptope. We note that (3.1) is exactly in the standard dual form (2.2) for semidefinite 
programming and correspondingly that (3.2) is in the standard primal form (2.1) for 
semidefinite programming. 

Suppose $\mathcal{U}$ is recoverable by MTFA. Fix a diagonal matrix $D^\star$ and a positive 
semidefinite matrix $L^\star$ with column space $\mathcal{U}$ and let $X = D^\star + L^\star$. Since (3.1) 
and (3.2) are strictly feasible, by Theorem 2.1 (optimality conditions for semidefinite 
programming), the pair $(\mathrm{diag}(D^\star), L^\star)$ is an optimum of (3.1) if and only if there is 
some correlation matrix $Y^\star$ such that $Y^\star L^\star = 0$. Since $\mathcal{R}(L^\star) = \mathcal{U}$ this implies that 
$\mathcal{U}$ is realizable. Conversely, if $\mathcal{U}$ is realizable, there is some $Y^\star$ such that $Y^\star L^\star = 0$ 
for every $L^\star$ with column space $\mathcal{U}$, showing that $\mathcal{U}$ is recoverable by MTFA. □ 

Remark. We note that in the proof of Proposition 3.1 we established that the 
two versions of the recovery problem stated in Section 2.3 are actually equivalent. 
In particular, whether $(D^\star, L^\star)$ is the optimum of MTFA with input $X = D^\star + L^\star$ 
depends only on the column space of $L^\star$. 
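
A small worked instance may help fix the three equivalent viewpoints of Proposition 3.1. The sketch below takes the subspace spanned by $u = (1, 1, 1)$ in $\mathbb{R}^3$, a matrix $V$ whose rows span the orthogonal complement, and a matrix $M$ defining a centered ellipse through the columns of $V$; it then forms $Y = V^T M V$ and checks that $Y$ has unit diagonal and annihilates $u$. The specific $V$ and $M$ are hand-picked for this illustration and the function name is hypothetical.

```python
def gram_with_metric(V, M):
    """Return Y = V^T M V for a k x n matrix V and a k x k matrix M (lists of rows)."""
    k, n = len(V), len(V[0])
    cols = [[V[a][i] for a in range(k)] for i in range(n)]
    return [
        [sum(cols[i][a] * M[a][b] * cols[j][b] for a in range(k) for b in range(k))
         for j in range(n)]
        for i in range(n)
    ]


# U = span{(1, 1, 1)} in R^3; both rows of V are orthogonal to u.
u = [1.0, 1.0, 1.0]
V = [[1.0, -1.0, 0.0],
     [1.0, 1.0, -2.0]]
# M is positive definite and defines the ellipse {x : x^T M x = 1}
# passing through the columns (1, 1), (-1, 1), (0, -2) of V.
M = [[0.75, 0.0],
     [0.0, 0.25]]

Y = gram_with_metric(V, M)
Y_times_u = [sum(Y[i][j] * u[j] for j in range(3)) for i in range(3)]
print(Y)          # unit diagonal, so Y is a correlation matrix
print(Y_times_u)  # zero vector, so the nullspace of Y contains u
```

Since $Y u = 0$, we also have $Y L = 0$ for every $L = c\, u u^T$ with $c > 0$, which is the complementary slackness condition used in the proof to certify that MTFA recovers such an $L$ from $D + L$ for any diagonal $D$.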

3.1. Certificates of failure. We can prove that a subspace $\mathcal{U}$ is realizable by 
constructing a correlation matrix with nullspace containing $\mathcal{U}$. We can prove that 
a subspace is not realizable by constructing a matrix that certifies this fact. 
Geometrically, a subspace $\mathcal{U}$ is realizable if and only if the subspace 
$\mathcal{L}_{\mathcal{U}} = \{X \in \mathcal{S}^n : \mathcal{N}(X) \supseteq \mathcal{U}\}$ of symmetric matrices intersects with the elliptope. So a certificate that 
$\mathcal{U}$ is not realizable is a hyperplane in the space of symmetric matrices that strictly 
separates the elliptope from $\mathcal{L}_{\mathcal{U}}$. The following lemma describes the structure of these 
separating hyperplanes. 

Lemma 3.2. A subspace $\mathcal{U}$ of $\mathbb{R}^n$ is not realizable if and only if there is a diagonal 
matrix $D$ such that $\mathrm{tr}(D) > 0$ and $v^T D v \leq 0$ for all $v \in \mathcal{U}^\perp$. 



Proof. By Proposition 3.1, $\mathcal{U}$ is not realizable if and only if $\mathcal{U}^\perp$ does not have 
the ellipsoid fitting property. Let $V$ be a $k \times n$ matrix with row space $\mathcal{U}^\perp$. Then $\mathcal{U}^\perp$ 
does not have the ellipsoid fitting property if and only if we cannot find an ellipsoid 
passing through the columns of $V$, i.e. the semidefinite program 

$$\underset{M}{\text{minimize}}\ \langle 0, M \rangle \quad \text{subject to} \quad \mathrm{diag}(V^T M V) = \mathbf{1},\ M \succeq 0 \tag{3.3}$$

is infeasible. The semidefinite programming dual of (3.3) is 

$$\underset{d}{\text{maximize}}\ \langle d, \mathbf{1} \rangle \quad \text{subject to} \quad V \mathrm{diag}^*(d) V^T \preceq 0. \tag{3.4}$$

Since (3.4) is clearly always feasible, by strong duality (which holds because both 
primal and dual problems are strictly feasible) (3.3) is infeasible if and only if (3.4) 
is unbounded. This occurs if and only if there is some $d$ with $\sum_i d_i > 0$ and yet 
$V \mathrm{diag}^*(d) V^T \preceq 0$. Then $D = \mathrm{diag}^*(d)$ has the properties in the statement of the 
lemma. □
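
To make Lemma 3.2 concrete, here is a minimal Python sketch that checks a candidate certificate $D = \mathrm{diag}^*(d)$. For brevity it assumes that the orthogonal complement of the subspace is spanned by a single vector $w$, so that the condition $v^T D v \leq 0$ only needs to be tested at $w$. The function name and the example vectors are illustrative and not taken from the paper.

```python
def certifies_not_realizable(d, w):
    """Check the Lemma 3.2 conditions when U-perp is spanned by the single vector w.

    For v = t * w we have v^T D v = t^2 * (w^T D w), so testing w itself suffices.
    """
    trace_positive = sum(d) > 0
    quad = sum(di * wi * wi for di, wi in zip(d, w))
    return trace_positive and quad <= 0


# U = span{(3, 1)} in R^2, so U-perp = span{(1, -3)}.
# d = (8, -1): tr(D) = 7 > 0 and w^T D w = 8 - 9 = -1 <= 0.
print(certifies_not_realizable([8.0, -1.0], [1.0, -3.0]))

# U = span{(1, 1)}, so U-perp = span{(1, -1)}: here w^T D w = d_1 + d_2 = tr(D),
# so no d can satisfy both conditions, and this particular d fails the test.
print(certifies_not_realizable([8.0, -1.0], [1.0, -1.0]))
```

The second example is consistent with the fact that the $2 \times 2$ matrix with ones on the diagonal and $-1$ off the diagonal is a correlation matrix whose nullspace contains $(1, 1)$.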

3.2. Exploiting connections: results for one dimensional subspaces. In 
1940, Ledermann [21] characterized the one dimensional subspaces that are 
recoverable by MTFA. In 1990, Grone et al. [15] gave a necessary condition for a subspace 
to be realizable. In 1993, independently of Ledermann's work, Delorme and Poljak [8] 
showed that this condition is also sufficient for one dimensional subspaces. Since we 
have established that a subspace is recoverable by MTFA if and only if it is realizable, 
Ledermann's result and Delorme and Poljak's results are equivalent. In this section 
we translate these equivalent results into the context of the ellipsoid fitting problem, 
giving a geometric characterization of when it is possible to fit a centered ellipsoid to 
$k + 1$ points in $\mathbb{R}^k$. 

Delorme and Poljak state their result in terms of the following definition. 

Definition 3.3 ([8]). A vector $u \in \mathbb{R}^n$ is balanced if, for all $i \in [n]$, 

$$|u_i| \leq \sum_{j \neq i} |u_j|. \tag{3.5}$$

If the inequality is strict we say that u is strictly balanced. 
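
The balance condition is straightforward to test numerically. The following Python sketch is one way to do so; the function name and the `strict` flag are hypothetical conveniences, and for floating-point inputs near the boundary the comparison is subject to rounding.

```python
def is_balanced(u, strict=False):
    """Check |u_i| <= sum_{j != i} |u_j| for every i (with < instead if strict=True)."""
    total = sum(abs(x) for x in u)
    for x in u:
        rest = total - abs(x)  # sum of the other entries' absolute values
        if strict:
            if not abs(x) < rest:
                return False
        elif not abs(x) <= rest:
            return False
    return True


print(is_balanced([1, 1, 1]))               # every entry is at most the sum of the others
print(is_balanced([3, 1, 1]))               # 3 > 1 + 1, so not balanced
print(is_balanced([2, 1, 1], strict=True))  # 2 = 1 + 1: balanced but not strictly
```

Equivalently, a vector is balanced exactly when twice its largest absolute entry is at most the sum of all absolute entries.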

In the following, the necessary condition is due to Grone et al. [15] and the 
sufficient condition is due to Ledermann [21] (in the context of the analysis of MTFA) 
and Delorme and Poljak [8] (in the context of the facial structure of the elliptope). 
We state the result only in terms of realizability of a subspace. 

Theorem 3.4. If a subspace $\mathcal{U}$ of $\mathbb{R}^n$ is realizable then every $u \in \mathcal{U}$ is balanced. 
If $\mathcal{U} = \mathrm{span}\{u\}$ is one-dimensional then $\mathcal{U}$ is realizable if and only if $u$ is balanced. 

The balance condition has a particularly natural geometric interpretation in the 
ellipsoid fitting setting (Lemma 3.5 below). The proof is a fairly straightforward 
application of linear programming duality, which we defer to Appendix A. 

Lemma 3.5. Suppose $V$ is any $k \times n$ matrix with $\mathcal{N}(V) = \mathcal{U}$. Denote the 
columns of $V$ by $v_1, v_2, \ldots, v_n \in \mathbb{R}^k$. Then every $u \in \mathcal{U}$ is balanced if and only if for 
each $i \in [n]$, $v_i$ lies on the boundary of the convex hull of $\pm v_1, \pm v_2, \ldots, \pm v_n$. 

By combining Theorem 3.4 with Lemma 3.5 we are in a position to interpret 
Theorem 3.4 purely in terms of ellipsoid fitting. 

Corollary 3.6. If there is an ellipsoid passing through $\pm v_1, \pm v_2, \ldots, \pm v_n \in \mathbb{R}^k$ 
then $\pm v_1, \pm v_2, \ldots, \pm v_n$ lie on the boundary of their convex hull. If, in addition, 
$k = n - 1$ the converse also holds. 

We note that $\pm v_1, \pm v_2, \ldots, \pm v_n$ lie on the boundary of their convex hull if and 
only if there exists some convex set with boundary containing $\pm v_1, \pm v_2, \ldots, \pm v_n$. In 
this geometric setting, it is clear that this is a necessary condition to be able to find 
a centered ellipsoid passing through the points, but not so obvious that it is sufficient 
if $k = n - 1$. 

4. A sufficient condition for the three problems. In this section we establish 
a new sufficient condition for a subspace $\mathcal{U}$ of $\mathbb{R}^n$ to be realizable and consequently 
a sufficient condition for $\mathcal{U}$ to be recoverable by MTFA and for $\mathcal{U}^\perp$ to have the ellipsoid 
fitting property. Our condition is based on a simple property of a subspace known as 
coherence. 

Given a subspace $\mathcal{U}$ of $\mathbb{R}^n$, the coherence of $\mathcal{U}$ is a measure of how close the 
subspace is to containing any of the elementary unit vectors. This notion was introduced 
(with a different scaling) by Candès and Recht in their work on low-rank matrix 
completion [5], although related quantities have played an important role in the analysis 
of sparse reconstruction problems since the work of Donoho and Huo [10]. 

Definition 4.1. If $\mathcal{U}$ is a subspace of $\mathbb{R}^n$ then the coherence of $\mathcal{U}$ is 

$$\mu(\mathcal{U}) = \max_{i \in [n]} \|P_{\mathcal{U}} e_i\|_2^2.$$

A basic property of coherence is that it satisfies the inequality 

$$\frac{\dim(\mathcal{U})}{n} \leq \mu(\mathcal{U}) \leq 1 \tag{4.1}$$

for any subspace $\mathcal{U}$ of $\mathbb{R}^n$ [5]. This inequality, together with the definition of coherence, 
provides useful intuition about the properties of subspaces with low coherence, that is 
incoherence. Any subspace with low coherence is necessarily of low dimension and far 
from containing any of the elementary unit vectors. As such, any symmetric matrix 
with incoherent row/column spaces is necessarily of low rank and quite different from 
being a diagonal matrix. 
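
The coherence of a subspace given by a spanning set can be computed from any orthonormal basis $q_1, \ldots, q_d$ of the subspace, since $\|P_{\mathcal{U}} e_i\|_2^2 = \sum_j (q_j)_i^2$. The sketch below uses Gram-Schmidt orthonormalization, which is our choice for a dependency-free illustration rather than anything prescribed by the paper; the function names and tolerance are hypothetical.

```python
import math


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors that are already in the span
            basis.append([wi / norm for wi in w])
    return basis


def coherence(vectors):
    """mu(U) = max_i ||P_U e_i||_2^2 for U = span(vectors)."""
    basis = orthonormal_basis(vectors)
    n = len(vectors[0])
    return max(sum(q[i] ** 2 for q in basis) for i in range(n))


print(coherence([[1, 1, 1, 1]]))                # attains the lower bound dim(U)/n
print(coherence([[1, 0, 0]]))                   # contains e_1, so coherence is 1
print(coherence([[1, 1, 0, 0], [0, 0, 1, 1]]))  # two-dimensional subspace of R^4
```

The lower bound in (4.1) can be read off from this formula: the quantities $\|P_{\mathcal{U}} e_i\|_2^2$ sum to $\mathrm{tr}(P_{\mathcal{U}}) = \dim(\mathcal{U})$, so their maximum is at least $\dim(\mathcal{U})/n$.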

4.1. Coherence-threshold-type sufficient conditions. In this section we focus 
on finding the largest possible $\alpha$ such that 

$$\mu(\mathcal{U}) < \alpha \implies \mathcal{U} \text{ is realizable},$$

that is finding the best possible coherence-threshold-type sufficient condition for a 
subspace to be realizable. Such conditions are of particular interest because the 
dependence they have on the ambient dimension and the dimension of the subspace is 
only the mild dependence implied by (4.1). In contrast, existing results (e.g. [8], [20], [19]) 
about realizability of subspaces hold only for specific combinations of the ambient 
dimension and the dimension of the subspace. 

The following theorem, our main result, gives a sufficient condition for realizability 
based on a coherence-threshold condition. Furthermore, it establishes that this is the 
best possible coherence-threshold-type sufficient condition. 

Theorem 4.2. If $\mathcal{U}$ is a subspace of $\mathbb{R}^n$ and $\mu(\mathcal{U}) < 1/2$ then $\mathcal{U}$ is realizable. On
the other hand, given any $\alpha > 1/2$, there is a subspace $\mathcal{U}$ with $\mu(\mathcal{U}) = \alpha$ that is not
realizable.

Proof. We give the main idea of the proof, deferring some details to Appendix A.
Instead of proving that there is some $Y \in \mathcal{T}_{\mathcal{U}} = \{Y \succeq 0 : \mathcal{N}(Y) \supseteq \mathcal{U}\}$ such that $Y_{ii} = 1$
for $i \in [n]$, it suffices to choose a convex cone $\mathcal{K}$ that is an inner approximation to
$\mathcal{T}_{\mathcal{U}}$ and establish that there is some $Y \in \mathcal{K}$ such that $Y_{ii} = 1$ for $i \in [n]$. One natural
choice is to take $\mathcal{K} = \{P_{\mathcal{U}^\perp} \operatorname{diag}^*(\lambda) P_{\mathcal{U}^\perp} : \lambda \ge 0\}$, which is clearly contained in $\mathcal{T}_{\mathcal{U}}$.
Note that there is some $Y \in \mathcal{K}$ such that $Y_{ii} = 1$ for all $i \in [n]$ if and only if there is
$\lambda \ge 0$ such that

$$\operatorname{diag}\left(P_{\mathcal{U}^\perp} \operatorname{diag}^*(\lambda) P_{\mathcal{U}^\perp}\right) = \mathbf{1}. \tag{4.2}$$

The rest of the proof of the sufficient condition involves showing that if $\mu(\mathcal{U}) < 1/2$
then such a non-negative $\lambda$ exists. We establish this in Lemma A.1.

Now let us construct, for any $\alpha > 1/2$, a subspace with coherence $\alpha$ that is not
realizable. Let $\mathcal{U}$ be the subspace of $\mathbb{R}^2$ spanned by $u = (\sqrt{\alpha}, \sqrt{1 - \alpha})$. Then
$\mu(\mathcal{U}) = \max\{\alpha, 1 - \alpha\} = \alpha$ and yet by Theorem 3.4 $\mathcal{U}$ is not realizable because $u$ is
not balanced. □
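The counterexample in the proof can be checked numerically. The sketch below assumes the notion of balance used in Appendix A.1, namely that $u$ is balanced when $|u_i| \le \sum_{j \ne i} |u_j|$ for every $i$; the helper names `is_balanced` and `line_coherence` are hypothetical and introduced only for this illustration.

```python
import math

def is_balanced(u):
    """True if |u_i| <= sum_{j != i} |u_j| for every i (assumed definition of balance)."""
    total = sum(abs(x) for x in u)
    return all(2 * abs(x) <= total for x in u)

def line_coherence(u):
    """Coherence of the line span{u}: max_i u_i^2 / ||u||_2^2."""
    squared_norm = sum(x * x for x in u)
    return max(x * x for x in u) / squared_norm

alpha = 0.6  # any value in (1/2, 1] behaves the same way
u = [math.sqrt(alpha), math.sqrt(1 - alpha)]
print(line_coherence(u))  # equals alpha up to rounding
print(is_balanced(u))     # False, since |u_1| > |u_2|
```

For any $\alpha > 1/2$ the first coordinate strictly dominates the second, so balance fails, which is what Theorem 3.4 needs in order to rule out realizability.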

Remarks. Theorem 4.2 illustrates both the power and limitations of coherence-threshold-type
conditions. On the one hand, since coherence is quite a coarse property
of a subspace, the result applies to 'many' subspaces (see Proposition 4.6 in
Section 4.3). On the other hand, since coherence has very mild dimension dependence,
the power of coherence-threshold-type conditions is limited to their specialization to
low-dimensional situations, such as one-dimensional subspaces of $\mathbb{R}^2$.



4.2. Interpretations of Theorem 4.2. We now establish two corollaries of our
coherence-threshold-type sufficient condition for realizability. These corollaries can be
thought of as re-interpretations of the coherence inequality $\mu(\mathcal{U}) < 1/2$ in terms of
other natural quantities.

An ellipsoid-fitting interpretation. With the aid of Proposition 3.1 we reinterpret
our coherence-threshold-type sufficient condition as a sufficient condition on a set of
points in $\mathbb{R}^k$ that ensures there is a centered ellipsoid passing through them. The
condition involves 'sandwiching' the points between two ellipsoids (that depend on
the points). Indeed, given $v_1, v_2, \ldots, v_n \in \mathbb{R}^k$ and $0 < \beta < 1$ we define the ellipsoid

$$\mathcal{E}_\beta(v_1, \ldots, v_n) = \Big\{x \in \mathbb{R}^k : x^T \Big(\sum_{j=1}^n v_j v_j^T\Big)^{-1} x \le \beta\Big\}.$$

Definition 4.3. Given $0 < \beta < 1$ the points $v_1, v_2, \ldots, v_n$ satisfy the $\beta$-sandwich
condition if

$$\{v_1, v_2, \ldots, v_n\} \subset \mathcal{E}_1(v_1, \ldots, v_n) \setminus \mathcal{E}_\beta(v_1, \ldots, v_n).$$



The intuition behind this definition (illustrated in Figure 4.1) is that if the points
satisfy the $\beta$-sandwich condition for $\beta$ close to one, then they are confined to a thin
elliptical shell that is adapted to their position. One might expect that it is 'easier'
to fit an ellipsoid to points that are confined in this way. Indeed this is the case.

Corollary 4.4. If $v_1, v_2, \ldots, v_n \in \mathbb{R}^k$ satisfy the 1/2-sandwich condition then
there is a centered ellipsoid passing through $v_1, v_2, \ldots, v_n$.

Proof. Let $V$ be the $k \times n$ matrix with columns given by the $v_i$, and let $\mathcal{U}$ be the
nullspace of $V$. Then the orthogonal projection onto the row space of $V$ is $P_{\mathcal{U}^\perp}$, and
can be written as

$$P_{\mathcal{U}^\perp} = V^T (V V^T)^{-1} V.$$

Our assumption that the points satisfy the 1/2-sandwich condition is equivalent to
assuming that $1/2 < [P_{\mathcal{U}^\perp}]_{ii} \le 1$ for all $i \in [n]$, or alternatively that

$$\mu(\mathcal{U}) = \max_{i \in [n]} [P_{\mathcal{U}}]_{ii} = 1 - \min_{i \in [n]} [P_{\mathcal{U}^\perp}]_{ii} < 1/2.$$

Fig. 4.1: The ellipsoids shown are $\mathcal{E} = \mathcal{E}_1(v_1, v_2, v_3)$ and $\mathcal{E}' = \mathcal{E}_{1/2}(v_1, v_2, v_3)$. There is
an ellipsoid passing through $v_1$, $v_2$ and $v_3$ because the points are sandwiched between
$\mathcal{E}$ and $\mathcal{E}'$.



From Theorem 4.2 we know that $\mu(\mathcal{U}) < 1/2$ implies that $\mathcal{U}$ is realizable. Invoking
Proposition 3.1 we then conclude that there is a centered ellipsoid passing through
$v_1, v_2, \ldots, v_n$. □
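The argument behind Corollary 4.4 is constructive, and a minimal sketch of it follows. It assumes the points span $\mathbb{R}^k$, checks the 1/2-sandwich condition through the diagonal of $V^T (V V^T)^{-1} V$, solves the linear system (4.2) for $\lambda$, and returns $M = (V V^T)^{-1} V \operatorname{diag}(\lambda) V^T (V V^T)^{-1}$, for which $v_i^T M v_i = 1$. All function names are hypothetical, and the dense Gauss-Jordan solver is only meant for small examples.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[i])) + list(map(float, B[i])) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def fit_centered_ellipsoid(points):
    """Return (M, lam) with v^T M v = 1 for every point v, or None if the
    1/2-sandwich condition fails. Assumes the points span R^k."""
    n, k = len(points), len(points[0])
    Vt = [list(map(float, p)) for p in points]        # n x k, rows are the points
    V = [list(col) for col in zip(*Vt)]               # k x n, columns are the points
    identity = [[float(i == j) for j in range(k)] for i in range(k)]
    G = solve(matmul(V, Vt), identity)                # (V V^T)^{-1}
    GV = matmul(G, V)                                 # k x n
    P = matmul(Vt, GV)                                # projector V^T (V V^T)^{-1} V
    if not all(0.5 < P[i][i] <= 1 + 1e-12 for i in range(n)):
        return None
    H = [[P[i][j] ** 2 for j in range(n)] for i in range(n)]   # entry-wise square
    lam = [row[0] for row in solve(H, [[1.0] for _ in range(n)])]
    GVL = [[GV[a][j] * lam[j] for j in range(n)] for a in range(k)]
    M = matmul(GVL, [list(col) for col in zip(*GV)])  # G V diag(lam) V^T G
    return M, lam

def quad_form(M, v):
    """Evaluate v^T M v."""
    return sum(v[a] * M[a][b] * v[b] for a in range(len(v)) for b in range(len(v)))

pts = [(1.0, 0.0), (0.0, 1.0), (-0.8, -0.9)]
M, lam = fit_centered_ellipsoid(pts)
print([round(quad_form(M, v), 9) for v in pts])  # each entry should be 1 up to rounding
```

Returning `None` only means that this particular sufficient condition does not apply; the points may still admit a centered ellipsoid by other means.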

A balance interpretation. In Section 3.2 we saw that if a subspace $\mathcal{U}$ is realizable,
every $u \in \mathcal{U}$ is balanced. The sufficient condition of Theorem 4.2 can be expressed in
terms of a balance condition on the element-wise square of the elements of a subspace.
(In what follows $u \circ u$ denotes the element-wise square of a vector in $\mathbb{R}^n$.)

Corollary 4.5. Suppose $\mathcal{U}$ is a subspace of $\mathbb{R}^n$. If $u \circ u$ is strictly balanced for
every $u \in \mathcal{U}$ then $\mathcal{U}$ is realizable.

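Proof. It suffices to show that if for every $u \in \mathcal{U}$, $u \circ u$ is strictly balanced, then
$\mu(\mathcal{U}) < 1/2$ (although we could reverse the argument to establish the equivalence of
these conditions). If $u \circ u$ is strictly balanced for all $u \in \mathcal{U}$ then for all $i \in [n]$ and all
$u \in \mathcal{U} \setminus \{0\}$

$$2\langle e_i, u\rangle^2 < \sum_{j=1}^n \langle e_j, u\rangle^2 = \|u\|_2^2. \tag{4.3}$$

Since $\|P_{\mathcal{U}} e_i\|_2 = \max_{u \in \mathcal{U} \setminus \{0\}} \langle e_i, u\rangle / \|u\|_2$, it follows from (4.3) that $2\|P_{\mathcal{U}} e_i\|_2^2 < 1$.
Since this holds for all $i \in [n]$ it follows that $\mu(\mathcal{U}) < 1/2$. □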

Remark. Suppose $\mathcal{U} = \operatorname{span}\{u\}$ is a one-dimensional subspace of $\mathbb{R}^n$. We have
just established that if $u \circ u$ is strictly balanced then $\mathcal{U}$ is realizable and so (by
Theorem 3.4) $u$ must be balanced. We note that it is straightforward to establish
directly that if $u \circ u$ is balanced then $u$ is balanced by using the definition of balance
and the fact that $\|x\|_1 \ge \|x\|_2$ for any $x \in \mathbb{R}^n$.



4.3. Examples. To gain more intuition for what Theorem 4.2 means, we consider
its implications in two particular cases. First, we compare the characterization
of when it is possible to fit an ellipsoid to $k + 1$ points in $\mathbb{R}^k$ (Corollary 3.6) with the
specialization of our sufficient condition to this case (Corollary 4.4). This comparison
provides some insight into how conservative our sufficient condition is. Second, we
investigate the coherence properties of suitably 'random' subspaces. This provides
intuition about whether or not $\mu(\mathcal{U}) < 1/2$ is a very restrictive condition. In particular,
we establish that 'most' subspaces of $\mathbb{R}^n$ with dimension bounded above by $(1/2 - \varepsilon)n$
are realizable.

(a) The shaded set is $R$, those points $v$ for which we can fit an ellipsoid through
$v$ and the standard basis vectors.

(b) The shaded set is $R'$, those points $v$ such that $v$, $e_1$ and $e_2$ satisfy the
condition of Corollary 4.4.

Fig. 4.2: Comparing our sufficient condition for ellipsoid fitting (Corollary 4.4) with
the characterization (Corollary 3.6) in the case of fitting an ellipsoid to $k + 1$ points.


Fitting an ellipsoid to $k + 1$ points in $\mathbb{R}^k$. Recall that the result of Ledermann and of
Delorme and Poljak, interpreted in terms of ellipsoid fitting, tells us that we can fit
an ellipsoid to $k + 1$ points $v_1, \ldots, v_{k+1} \in \mathbb{R}^k$ if and only if those points are on
the boundary of the convex hull of $\{\pm v_1, \ldots, \pm v_{k+1}\}$ (see Corollary 3.6). We now
compare this characterization with the 1/2-sandwich condition, which is sufficient by
Corollary 4.4.

Without loss of generality we assume that $k$ of the points are $e_1, \ldots, e_k$, the
standard basis vectors, and compare the conditions by considering the set of locations
of the $(k + 1)$st point $v \in \mathbb{R}^k$ for which we can fit an ellipsoid through all $k + 1$ points.
Corollary 3.6 gives a characterization of this region as

$$R = \Big\{v \in \mathbb{R}^k : \sum_{j=1}^k |v_j| \ge 1,\ |v_i| - \sum_{j \ne i} |v_j| \le 1 \text{ for } i \in [k]\Big\}$$

which is shown in Figure 4.2a in the case $k = 2$. The set of $v$ such that $v, e_1, \ldots, e_k$
satisfy the 1/2-sandwich condition can be written as

$$R' = \Big\{v \in \mathbb{R}^k : v^T (I + v v^T)^{-1} v > 1/2,\ e_i^T (I + v v^T)^{-1} e_i > 1/2 \text{ for } i \in [k]\Big\}$$

$$\phantom{R'} = \Big\{v \in \mathbb{R}^k : \sum_{j=1}^k v_j^2 > 1,\ v_i^2 - \sum_{j \ne i} v_j^2 < 1 \text{ for } i \in [k]\Big\}$$

which is shown in Figure 4.2b. It is clear that $R' \subseteq R$.
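The passage from the first to the second description of $R'$ can be spelled out as follows. This is a sketch that assumes only the Sherman-Morrison formula and the fact that the matrix $V = [e_1 \ \cdots \ e_k \ v]$ satisfies $V V^T = I + v v^T$.

```latex
\begin{align*}
(I + v v^T)^{-1} &= I - \frac{v v^T}{1 + \|v\|_2^2}, \\
v^T (I + v v^T)^{-1} v &= \|v\|_2^2 - \frac{\|v\|_2^4}{1 + \|v\|_2^2}
  = \frac{\|v\|_2^2}{1 + \|v\|_2^2} > \tfrac{1}{2}
  \iff \sum_{j=1}^k v_j^2 > 1, \\
e_i^T (I + v v^T)^{-1} e_i &= 1 - \frac{v_i^2}{1 + \|v\|_2^2} > \tfrac{1}{2}
  \iff 2 v_i^2 < 1 + \sum_{j=1}^k v_j^2
  \iff v_i^2 - \sum_{j \ne i} v_j^2 < 1 .
\end{align*}
```

The inclusion $R' \subseteq R$ then follows from $\|v\|_1 \ge \|v\|_2$ together with the observation that $|v_i| > 1 + \sum_{j \ne i} |v_j|$ would force $v_i^2 > 1 + \sum_{j \ne i} v_j^2$.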

Realizability of random subspaces. Suppose $\mathcal{U}$ is a subspace generated by taking
the column space of an $n \times r$ matrix with i.i.d. standard Gaussian entries. For what
values of $r$ and $n$ does such a subspace have $\mu(\mathcal{U}) < 1/2$ with high probability,
i.e. satisfy our sufficient condition for being realizable?

The following result essentially shows that for large $n$, 'most' subspaces of
dimension at most $(1/2 - \varepsilon)n$ are realizable. This suggests that MTFA is a very good
heuristic for diagonal and low-rank decomposition problems in the high-dimensional
setting. Indeed 'most' subspaces of dimension up to one half the ambient dimension
(hardly just low-dimensional subspaces) are recoverable by MTFA.

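Proposition 4.6. Let $0 < \varepsilon < 1/2$ be a constant and suppose $n > 6/(\varepsilon^2 - 2\varepsilon^3)$.
There are positive constants $\bar{c}$, $\tilde{c}$ (depending only on $\varepsilon$) such that if $\mathcal{U}$ is a random
$(1/2 - \varepsilon)n$ dimensional subspace of $\mathbb{R}^n$ then

$$\Pr[\mathcal{U} \text{ is realizable}] \ge 1 - \bar{c} \sqrt{n}\, e^{-\tilde{c} n}.$$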



We provide a proof of this result in Appendix A. The main idea is that the
coherence of a random $r$ dimensional subspace of $\mathbb{R}^n$ is the maximum of $n$ random
variables that concentrate around their mean of $r/n$ for large $n$.

To illustrate the result, we consider the case where $\varepsilon = 1/4$ and $n > 192$. Then
(by examining the proof in Appendix A) we see that we can take $\tilde{c} = 1/24$ and
$\bar{c} = 24/\sqrt{3\pi} \approx 7.8$. Hence if $n > 192$ and $\mathcal{U}$ is a random $n/4$ dimensional subspace of
$\mathbb{R}^n$ we have that

$$\Pr[\mathcal{U} \text{ is realizable}] \ge 1 - 7.8 \sqrt{n}\, e^{-n/24}.$$
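To get a feel for how quickly this bound approaches one, the following sketch evaluates it for a given $n$ and $\varepsilon$. The formulas for the two constants are an assumption read off the union-bound computation in Appendix A (with $a_\varepsilon = \varepsilon - 4\varepsilon^2/3$); they reproduce the values $1/24$ and $24/\sqrt{3\pi}$ quoted above for $\varepsilon = 1/4$. The function names are hypothetical.

```python
import math

def proposition_constants(eps):
    """Return (c_bar, c_tilde) for Proposition 4.6, assuming the Appendix A formulas."""
    if not 0 < eps < 0.5:
        raise ValueError("eps must lie in (0, 1/2)")
    a = eps - 4 * eps ** 2 / 3                           # a_eps from the tail bound
    c_bar = 1 / (a * math.sqrt(math.pi * (0.25 - eps ** 2)))
    c_tilde = a * (0.5 - eps)                            # exponent rate with r = (1/2 - eps) n
    return c_bar, c_tilde

def realizability_lower_bound(n, eps):
    """Lower bound 1 - c_bar * sqrt(n) * exp(-c_tilde * n) on Pr[U is realizable]."""
    c_bar, c_tilde = proposition_constants(eps)
    if n <= 6 / (eps ** 2 - 2 * eps ** 3):
        raise ValueError("Proposition 4.6 requires n > 6 / (eps^2 - 2 eps^3)")
    return 1 - c_bar * math.sqrt(n) * math.exp(-c_tilde * n)

for n in (200, 400, 800):
    print(n, realizability_lower_bound(n, 0.25))
```

The bound is already informative just above the threshold $n = 192$ and improves exponentially in $n$ from there.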

5. Tractable block diagonal and low-rank decompositions and related
problems. In this section we generalize our results to the analogue of MTFA for
block-diagonal and low-rank decompositions. Mimicking our earlier development, we
relate the analysis of this variant of MTFA to the facial structure of a variant of
the elliptope and a generalization of the ellipsoid fitting problem. The key point is
that these problems all possess additional symmetries that, once taken into account,
essentially allow us to reduce our analysis to cases already considered in Sections 3
and 4.

Throughout this section, let $\mathcal{P}$ be a fixed partition of $\{1, 2, \ldots, n\}$. We say a
matrix is $\mathcal{P}$-block-diagonal if it is zero except for the principal submatrices indexed
by the elements of $\mathcal{P}$. We denote by $\operatorname{blkdiag}_{\mathcal{P}}$ the map that takes an $n \times n$ matrix and
maps it to the principal submatrices indexed by $\mathcal{P}$. Its adjoint, denoted $\operatorname{blkdiag}_{\mathcal{P}}^*$,
takes a tuple of symmetric matrices $(X_I)_{I \in \mathcal{P}}$ and produces an $n \times n$ matrix that is
$\mathcal{P}$-block-diagonal with blocks given by the $X_I$.

We now describe the analogues of MTFA, ellipsoid fitting, and the problem of 
determining the facial structure of the elliptope. 

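Block minimum trace factor analysis. If $X = B^\star + L^\star$ where $B^\star$ is $\mathcal{P}$-block-diagonal
and $L^\star \succeq 0$ is low rank, the obvious analogue of MTFA is the semidefinite
program

$$\text{minimize } \operatorname{tr}(L) \quad \text{subject to} \quad \begin{cases} X = B + L \\ L \succeq 0 \\ B \text{ is } \mathcal{P}\text{-block-diagonal} \end{cases} \tag{5.1}$$

which we call block minimum trace factor analysis (BMTFA).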

Definition 5.1. A subspace $\mathcal{U}$ of $\mathbb{R}^n$ is recoverable by BMTFA if for every
$B^\star$ that is $\mathcal{P}$-block-diagonal and every positive semidefinite $L^\star$ with column space $\mathcal{U}$,
$(B^\star, L^\star)$ is the unique optimum of BMTFA with input $X = B^\star + L^\star$.

Faces of the $\mathcal{P}$-elliptope. Just as MTFA is related to the facial structure of the
elliptope, BMTFA is related to the facial structure of the spectrahedron

$$\mathcal{E}_{\mathcal{P}} = \{Y \succeq 0 : \operatorname{blkdiag}_{\mathcal{P}}(Y) = (I, I, \ldots, I)\}.$$

We refer to $\mathcal{E}_{\mathcal{P}}$ as the $\mathcal{P}$-elliptope. We extend the definition of a realizable subspace
to this context.

Definition 5.2. A subspace $\mathcal{U}$ of $\mathbb{R}^n$ is $\mathcal{P}$-realizable if there is some $Y \in \mathcal{E}_{\mathcal{P}}$
such that $\mathcal{N}(Y) \supseteq \mathcal{U}$.

Generalized ellipsoid fitting. To describe the $\mathcal{P}$-ellipsoid fitting problem we first
introduce some convenient notation. If $I \subseteq [n]$ we write

$$S^I = \{x \in \mathbb{R}^n : \|x\|_2 = 1,\ x_j = 0 \text{ if } j \notin I\} \tag{5.2}$$

for the intersection of the unit sphere with the coordinate subspace indexed by $I$.

Suppose $v_1, v_2, \ldots, v_n \in \mathbb{R}^k$ is a collection of points and $V$ is the $k \times n$ matrix
with columns given by the $v_i$. Noting that $S^{\{i\}} = \{-e_i, e_i\}$, and thinking of $V$ as
a linear map from $\mathbb{R}^n$ to $\mathbb{R}^k$, we see that the ellipsoid fitting problem is to find an
ellipsoid in $\mathbb{R}^k$ with boundary containing $\bigcup_{i \in [n]} V(S^{\{i\}})$, i.e. the collection of points
$\pm v_1, \ldots, \pm v_n$. The $\mathcal{P}$-ellipsoid fitting problem is then to find an ellipsoid in $\mathbb{R}^k$ with
boundary containing $\bigcup_{I \in \mathcal{P}} V(S^I)$, i.e. the collection of ellipsoids $V(S^I)$.

The generalization of the ellipsoid fitting property of a subspace is as follows.

Definition 5.3. A subspace $\mathcal{V}$ of $\mathbb{R}^n$ has the $\mathcal{P}$-ellipsoid fitting property if there
is a $k \times n$ matrix $V$ with row space $\mathcal{V}$ such that there is a centered ellipsoid in $\mathbb{R}^k$ with
boundary containing $\bigcup_{I \in \mathcal{P}} V(S^I)$.

5.1. Relating the generalized problems. The facial structure of the $\mathcal{P}$-elliptope,
BMTFA, and the $\mathcal{P}$-ellipsoid fitting problem are related by the following
result, the proof of which is omitted as it is almost identical to that of Proposition 3.1.

Proposition 5.4. Let $\mathcal{U}$ be a subspace of $\mathbb{R}^n$. Then the following are equivalent:

1. $\mathcal{U}$ is recoverable by BMTFA.

2. $\mathcal{U}$ is $\mathcal{P}$-realizable.

3. $\mathcal{U}^\perp$ has the $\mathcal{P}$-ellipsoid fitting property.



The following lemma is the analogue of Lemma 3.2. It describes certificates that
a subspace $\mathcal{U}$ is not $\mathcal{P}$-realizable. Again the proof is almost identical to that of
Lemma 3.2 so we omit it.

Lemma 5.5. A subspace $\mathcal{U}$ of $\mathbb{R}^n$ is not $\mathcal{P}$-realizable if and only if there is a
$\mathcal{P}$-block-diagonal matrix $B$ such that $\operatorname{tr}(B) > 0$ and $v^T B v \le 0$ for all $v \in \mathcal{U}^\perp$.

For the sake of brevity, in what follows we only discuss the problem of whether $\mathcal{U}$
is $\mathcal{P}$-realizable without explicitly translating the results into the context of the other
two problems.

5.2. Symmetries of the $\mathcal{P}$-elliptope. We now consider the symmetries of the
$\mathcal{P}$-elliptope. Our motivation for doing so is that it allows us to partition subspaces
into classes for which either all elements are $\mathcal{P}$-realizable or none of the elements are
$\mathcal{P}$-realizable.

It is clear that the $\mathcal{P}$-elliptope is invariant under conjugation by $\mathcal{P}$-block-diagonal
orthogonal matrices. Let $G_{\mathcal{P}}$ denote this subgroup of the group of $n \times n$ orthogonal
matrices. There is a natural action of $G_{\mathcal{P}}$ on subspaces of $\mathbb{R}^n$ defined as follows. If
$P \in G_{\mathcal{P}}$ and $\mathcal{U}$ is a subspace of $\mathbb{R}^n$ then $P \cdot \mathcal{U}$ is the image of the subspace $\mathcal{U}$ under
the map $P$. (It is straightforward to check that this is a well defined group action.)
If there exists some $P \in G_{\mathcal{P}}$ such that $P \cdot \mathcal{U} = \mathcal{U}'$ then we write $\mathcal{U} \sim \mathcal{U}'$ and say
that $\mathcal{U}$ and $\mathcal{U}'$ are equivalent. We care about this equivalence relation on subspaces
because the property of being $\mathcal{P}$-realizable is really a property of the corresponding
equivalence classes.

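Proposition 5.6. Suppose $\mathcal{U}$ and $\mathcal{U}'$ are subspaces of $\mathbb{R}^n$. If $\mathcal{U} \sim \mathcal{U}'$ then $\mathcal{U}$ is
$\mathcal{P}$-realizable if and only if $\mathcal{U}'$ is $\mathcal{P}$-realizable.

Proof. If $\mathcal{U}$ is $\mathcal{P}$-realizable there is $Y \in \mathcal{E}_{\mathcal{P}}$ such that $Yu = 0$ for all $u \in \mathcal{U}$.
Suppose $\mathcal{U}' = P \cdot \mathcal{U}$ for some $P \in G_{\mathcal{P}}$ and let $Y' = P Y P^T$. Then $Y' \in \mathcal{E}_{\mathcal{P}}$ and
$Y'(Pu) = (P Y P^T)(Pu) = P Y u = 0$ for all $u \in \mathcal{U}$. By the definition of $\mathcal{U}'$ it is then the case
that $Y'u' = 0$ for all $u' \in \mathcal{U}'$. Hence $\mathcal{U}'$ is $\mathcal{P}$-realizable. The converse clearly also
holds. □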

5.3. Exploiting symmetries: relating realizability and $\mathcal{P}$-realizability.
For a subspace of $\mathbb{R}^n$, we now consider how the notions of $\mathcal{P}$-realizability and
realizability (i.e. $[n]$-realizability) relate to each other. Since $\mathcal{E}_{\mathcal{P}} \subseteq \mathcal{E}_n$, if $\mathcal{U}$ is $\mathcal{P}$-realizable,
it is certainly also realizable. While the converse does not hold, we can establish the
following partial converse, which we subsequently use to extend our analysis from
Sections 3 and 4 to the present setting.

Theorem 5.7. A subspace $\mathcal{U}$ of $\mathbb{R}^n$ is $\mathcal{P}$-realizable if and only if $\mathcal{U}'$ is realizable
for every $\mathcal{U}'$ such that $\mathcal{U}' \sim \mathcal{U}$.

Proof. We note that one direction of the proof is obvious since $\mathcal{P}$-realizability
implies realizability. It remains to show that if $\mathcal{U}$ is not $\mathcal{P}$-realizable then there is
some $\mathcal{U}'$ equivalent to $\mathcal{U}$ that is not realizable.

Recall from Lemma 5.5 that if $\mathcal{U}$ is not $\mathcal{P}$-realizable there is some $\mathcal{P}$-block-diagonal
$X$ with positive trace such that $v^T X v \le 0$ for all $v \in \mathcal{U}^\perp$. Since $X$ is $\mathcal{P}$-block-diagonal
there is some $P \in G_{\mathcal{P}}$ such that $P X P^T$ is diagonal. Since conjugation by
orthogonal matrices preserves eigenvalues, $\operatorname{tr}(P X P^T) = \operatorname{tr}(X) > 0$. Furthermore
$v^T (P X P^T) v = (P^T v)^T X (P^T v) \le 0$ for all $v$ with $P^T v \in \mathcal{U}^\perp$. Hence $w^T (P X P^T) w \le 0$ for
all $w \in P \cdot \mathcal{U}^\perp = (P \cdot \mathcal{U})^\perp$. By Lemma 3.2, $P X P^T$ is a certificate that $P \cdot \mathcal{U}$ is not
realizable, completing the proof. □



The power of Theorem 5.7 lies in its ability to turn any condition for a subspace
to be realizable into a condition for the subspace to be $\mathcal{P}$-realizable by appropriately
symmetrizing the condition with respect to the action of $G_{\mathcal{P}}$. We now illustrate
this approach by generalizing Theorem 3.4 and our coherence-based condition
(Theorem 4.2) for a subspace to be $\mathcal{P}$-realizable. In each case we first define an appropriately
symmetrized version of the original condition. The natural symmetrized version of
the notion of balance is as follows.

Definition 5.8. A vector $u \in \mathbb{R}^n$ is $\mathcal{P}$-balanced if for all $I \in \mathcal{P}$

$$\|u_I\|_2 \le \sum_{J \in \mathcal{P} \setminus \{I\}} \|u_J\|_2.$$



We next define the appropriately symmetrized analogue of coherence. Just as
coherence measures how far a subspace is from any one-dimensional coordinate subspace,
$\mathcal{P}$-coherence measures how far a subspace is from any of the coordinate subspaces
indexed by elements of $\mathcal{P}$.

Definition 5.9. The $\mathcal{P}$-coherence of a subspace $\mathcal{U}$ of $\mathbb{R}^n$ is

$$\mu_{\mathcal{P}}(\mathcal{U}) = \max_{I \in \mathcal{P}} \max_{x \in S^I} \|P_{\mathcal{U}} x\|_2^2.$$

Just as the coherence of $\mathcal{U}$ can be computed by taking the maximum diagonal
element of $P_{\mathcal{U}}$, it is straightforward to verify that the $\mathcal{P}$-coherence of $\mathcal{U}$ can be computed
by taking the maximum of the spectral norms of the principal submatrices $[P_{\mathcal{U}}]_I$
indexed by $I \in \mathcal{P}$.

We now use Theorem 5.7 to establish the natural generalization of Theorem 3.4.

Corollary 5.10. If a subspace $\mathcal{U}$ of $\mathbb{R}^n$ is $\mathcal{P}$-realizable then every element of $\mathcal{U}$
is $\mathcal{P}$-balanced. If $\mathcal{U} = \operatorname{span}\{u\}$ is one dimensional then $\mathcal{U}$ is $\mathcal{P}$-realizable if and only
if $u$ is $\mathcal{P}$-balanced.

Proof. If there is $u \in \mathcal{U}$ that is not $\mathcal{P}$-balanced then there is $P \in G_{\mathcal{P}}$ such that
$Pu$ is not balanced (choose $P$ so that it rotates each $u_I$ until it has only one non-zero
entry). But then $P \cdot \mathcal{U}$ is not realizable and so $\mathcal{U}$ is not $\mathcal{P}$-realizable.

For the converse, we first show that if a vector is $\mathcal{P}$-balanced then it is balanced.
Let $I \in \mathcal{P}$, and consider $i \in I$. Then since $u$ is $\mathcal{P}$-balanced,

$$2|u_i| \le 2\|u_I\|_2 \le \sum_{J \in \mathcal{P}} \|u_J\|_2 \le \sum_{j=1}^n |u_j|$$

and so $u$ is balanced.

Now suppose $\mathcal{U} = \operatorname{span}\{u\}$ is one dimensional and $u$ is $\mathcal{P}$-balanced. Since $u$ is
$\mathcal{P}$-balanced it follows that $Pu$ is $\mathcal{P}$-balanced (and hence balanced) for every $P \in G_{\mathcal{P}}$. Then
by Theorem 3.4 $\operatorname{span}\{Pu\}$ is realizable for every $P \in G_{\mathcal{P}}$. Hence by Theorem 5.7 $\mathcal{U}$
is $\mathcal{P}$-realizable. □
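The relationship between balance and $\mathcal{P}$-balance used in this proof can be illustrated with a small sketch. It assumes the partition is given as a list of 0-based index lists, and that balance means $|u_i| \le \sum_{j \ne i} |u_j|$ for every $i$; the function names are hypothetical.

```python
import math

def is_balanced(u):
    """True if |u_i| <= sum_{j != i} |u_j| for every i."""
    total = sum(abs(x) for x in u)
    return all(2 * abs(x) <= total for x in u)

def is_P_balanced(u, partition):
    """True if ||u_I||_2 <= sum_{J != I} ||u_J||_2 for every block I of the partition."""
    norms = [math.sqrt(sum(u[i] ** 2 for i in block)) for block in partition]
    total = sum(norms)
    return all(2 * nm <= total for nm in norms)

# P-balanced implies balanced, as shown in the proof above.
u = [3.0, 4.0, 3.0, 3.0]
print(is_P_balanced(u, [[0, 1], [2], [3]]), is_balanced(u))

# The reverse implication fails: this vector is balanced but not P-balanced.
w = [1.0, 1.0, 1.0, 0.0]
print(is_balanced(w), is_P_balanced(w, [[0, 1, 2], [3]]))
```

With the partition into singletons the two notions coincide, which matches the remark that realizability is the same as $[n]$-realizability.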



Similarly, with the aid of Theorem 5.7 we can write down a $\mathcal{P}$-coherence-threshold
condition that is a sufficient condition for a subspace to be $\mathcal{P}$-realizable. The following
is a natural generalization of Theorem 4.2.

Corollary 5.11. If $\mu_{\mathcal{P}}(\mathcal{U}) < 1/2$ then $\mathcal{U}$ is $\mathcal{P}$-realizable.

Proof. By examining the constraints in the variational definitions of $\mu(\mathcal{U})$ and
$\mu_{\mathcal{P}}(\mathcal{U})$ we see that $\mu(\mathcal{U}) \le \mu_{\mathcal{P}}(\mathcal{U})$. Consequently if $\mu_{\mathcal{P}}(\mathcal{U}) < 1/2$ it follows from
Theorem 4.2 that $\mathcal{U}$ is realizable. Since $\mu_{\mathcal{P}}$ is invariant under the action of $G_{\mathcal{P}}$ on
subspaces we can apply Theorem 5.7 to complete the proof. □



6. Conclusions. We established a link between three problems of independent 
interest: deciding whether there is a centered ellipsoid passing through a collection 
of points, understanding the structure of the faces of the elliptope, and deciding 
which pairs of diagonal and low-rank matrices can be recovered from their sum using
a tractable semidefinite-programming-based heuristic, namely minimum trace factor 
analysis. We provided a simple sufficient condition, based on the notion of the co- 
herence of a subspace, which ensures the success of minimum trace factor analysis, 
and showed that this is the best possible coherence-threshold-type sufficient condition 
for this problem. We provided natural generalizations of our results to the problem 
of analyzing tractable block-diagonal and low-rank decompositions, showing how the 
symmetries of this problem allow us to reduce much of the analysis to the original 
diagonal and low-rank case. 

Our results suggest both the power and the limitations of using 'coarse' properties 
of a subspace such as coherence to gain understanding of the faces of the elliptope 
(and related problems). The power of results based on such properties is that they 
do not have explicit dimension-dependence, unlike previous results on the faces of the 
elliptope. At the same time, the lack of explicit dimension dependence typically yields
conservative sufficient conditions for high-dimensional problems. It would be
interesting to find a hierarchy of coherence-like conditions that provide less conservative
sufficient conditions for higher dimensional problem instances.

Appendix A. Additional proofs. 



A.1. Proof of Lemma 3.5. We first establish Lemma 3.5, which gives an
interpretation of the balance condition in terms of ellipsoid fitting.

Proof. The proof is a fairly straightforward application of linear programming
duality. Throughout let $V$ be the $k \times n$ matrix with columns given by the $v_j$. The
point $v_i \in \mathbb{R}^k$ is on the boundary of the convex hull of $\pm v_1, \ldots, \pm v_n$ if and only if
there exists $x \in \mathbb{R}^k$ such that $\langle x, v_i\rangle = 1$ and $|\langle x, v_j\rangle| \le 1$ for all $j \ne i$. Equivalently,
the following linear program (which depends on $i$) is feasible

$$\begin{cases} v_i^T x = 1 \\ |v_j^T x| \le 1 & \text{for all } j \ne i. \end{cases} \tag{A.1}$$

Suppose there is some $i$ such that $v_i$ is in the interior of $\operatorname{conv}\{\pm v_1, \ldots, \pm v_n\}$. Then
(A.1) is not feasible so the dual linear program (which depends on $i$)

$$\text{maximize } u_i - \sum_{j \ne i} |u_j| \quad \text{subject to } Vu = 0 \tag{A.2}$$

is unbounded. This is the case if and only if there is some $u$ in the nullspace of
$V$ such that $u_i > \sum_{j \ne i} |u_j|$. If such a $u$ exists, then it is certainly the case that
$|u_i| \ge u_i > \sum_{j \ne i} |u_j|$ and so $u$ is not balanced.

Conversely if $u$ is in the nullspace of $V$ and $u$ is not balanced then either $u$ or $-u$
satisfies $u_i > \sum_{j \ne i} |u_j|$ for some $i$. Hence the linear program (A.2) associated with
the index $i$ is unbounded and so the corresponding linear program (A.1) is infeasible.
It follows that $v_i$ is in the interior of the convex hull of $\pm v_1, \ldots, \pm v_n$. □



A.2. Completing the proof of Theorem 4.2. We now complete the proof of
Theorem 4.2 by establishing the following result about the existence of a non-negative
solution to the linear system (4.2).

Lemma A.1. If $\mu(\mathcal{U}) < 1/2$ then there is $\lambda \ge 0$ such that

$$\operatorname{diag}\left(P_{\mathcal{U}^\perp} \operatorname{diag}^*(\lambda) P_{\mathcal{U}^\perp}\right) = \mathbf{1}. \tag{A.3}$$



Proof. We note that the linear system (A.3) can be written as $(P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp})\lambda = \mathbf{1}$
where $\circ$ denotes the entry-wise product of matrices. As such, we need to show that
$P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp}$ is invertible and $(P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp})^{-1}\mathbf{1} \ge 0$. To do so, we appeal to the following
(slight restatement) of a theorem of Walters [31] regarding positive solutions to certain
linear systems.

Theorem A.2 (Walters [31]). Suppose $A$ is a square matrix with non-negative
entries and positive diagonal entries. Let $D$ be a diagonal matrix with $D_{ii} = A_{ii}$ for
all $i$. If $y \ge 0$ and $2y - AD^{-1}y > 0$ then $A$ is invertible and $A^{-1}y \ge 0$.



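In our case we take $A = P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp}$ and $y = \mathbf{1}$ in Theorem A.2. It is clear
that $P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp}$ is entry-wise non-negative. Furthermore $[P_{\mathcal{U}^\perp}]_{ii} = 1 - [P_{\mathcal{U}}]_{ii} \ge
1 - \mu(\mathcal{U}) > 1/2$ and so $D_{ii} = [P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp}]_{ii} > 1/4$. It then remains to show that
$(P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp}) D^{-1} \mathbf{1} < 2 \cdot \mathbf{1}$. Consider the $i$th such inequality, and observe that

$$\begin{aligned}
\left[(P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp}) D^{-1} \mathbf{1}\right]_i
&= \left[P_{\mathcal{U}^\perp} D_{ii}^{-1} e_i e_i^T P_{\mathcal{U}^\perp}\right]_{ii} + \left[P_{\mathcal{U}^\perp} (D^{-1} - D_{ii}^{-1} e_i e_i^T) P_{\mathcal{U}^\perp}\right]_{ii} \\
&\le 1 + \max_{j \ne i} D_{jj}^{-1} \left[P_{\mathcal{U}^\perp} (I - e_i e_i^T) P_{\mathcal{U}^\perp}\right]_{ii} \\
&\le 1 + 4[P_{\mathcal{U}^\perp}]_{ii} - 4[P_{\mathcal{U}^\perp}]_{ii}^2 \\
&= 2 - 4\left([P_{\mathcal{U}^\perp}]_{ii} - 1/2\right)^2 \\
&< 2
\end{aligned}$$

where we have used the assumption that $[P_{\mathcal{U}^\perp}]_{ii} > 1/2$ for all $i$ and the fact that
$P_{\mathcal{U}^\perp}^2 = P_{\mathcal{U}^\perp}$. Applying Walters's theorem completes the proof. □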
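As a sanity check on Lemma A.1, consider the simplest incoherent example, which is an illustration chosen here rather than one taken from the text: the line $\mathcal{U} = \operatorname{span}\{(1,1,1)\}$ in $\mathbb{R}^3$, with $\mu(\mathcal{U}) = 1/3 < 1/2$. Every quantity in the proof can be computed by hand.

```latex
\begin{align*}
P_{\mathcal{U}^\perp} &= I - \tfrac{1}{3}\mathbf{1}\mathbf{1}^T,
  \qquad [P_{\mathcal{U}^\perp}]_{ii} = \tfrac{2}{3}, \quad [P_{\mathcal{U}^\perp}]_{ij} = -\tfrac{1}{3} \ (i \ne j), \\
P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp} &= \tfrac{1}{3} I + \tfrac{1}{9}\mathbf{1}\mathbf{1}^T,
  \qquad D = \tfrac{4}{9} I, \\
(P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp}) D^{-1} \mathbf{1}
  &= \tfrac{9}{4}\left(\tfrac{1}{3} + \tfrac{3}{9}\right)\mathbf{1} = \tfrac{3}{2}\,\mathbf{1} < 2 \cdot \mathbf{1}, \\
(P_{\mathcal{U}^\perp} \circ P_{\mathcal{U}^\perp}) \lambda = \mathbf{1}
  &\iff \lambda = \tfrac{3}{2}\,\mathbf{1} \ge 0, \\
Y = P_{\mathcal{U}^\perp} \operatorname{diag}^*(\lambda) P_{\mathcal{U}^\perp}
  &= \tfrac{3}{2} P_{\mathcal{U}^\perp}, \qquad Y_{ii} = \tfrac{3}{2} \cdot \tfrac{2}{3} = 1, \qquad Y \mathbf{1} = 0 .
\end{align*}
```

So the hypothesis of Theorem A.2 holds with room to spare, the solution $\lambda$ is non-negative, and the resulting $Y$ is a positive semidefinite matrix with unit diagonal whose nullspace contains $\mathcal{U}$.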



A.3. Proof of Proposition 4.6. We now establish Proposition 4.6, giving a
bound on the probability that a suitably random subspace is realizable by bounding
the probability that it has coherence strictly bounded above by 1/2.

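Proof. It suffices to show that $\|P_{\mathcal{U}} e_i\|^2 < (1 + 2\varepsilon)(1/2 - \varepsilon) = 1/2 - 2\varepsilon^2 < 1/2$ for
all $i$ with high probability. The main observation we use is that if $\mathcal{U}$ is a random $r$
dimensional subspace of $\mathbb{R}^n$ and $x$ is any fixed vector with $\|x\| = 1$ then $\|P_{\mathcal{U}} x\|^2 \sim
\beta(r/2, (n - r)/2)$ where $\beta(p, q)$ denotes the beta distribution [13]. In the case where
$r = (1/2 - \varepsilon)n$, using a tail bound for $\beta$ random variables [13] we see that if $x \in \mathbb{R}^n$
is fixed and $r > 3/\varepsilon^2$ then

$$\Pr\left[\|P_{\mathcal{U}} x\|^2 \ge (1 + 2\varepsilon)(1/2 - \varepsilon)\right] < \frac{1}{a_\varepsilon} \frac{1}{(\pi(1/4 - \varepsilon^2))^{1/2}}\, n^{-1/2} e^{-a_\varepsilon r}$$

where $a_\varepsilon = \varepsilon - 4\varepsilon^2/3$. Taking a union bound over $n$ events, as long as $r > 3/\varepsilon^2$

$$\begin{aligned}
\Pr[\mu(\mathcal{U}) \ge 1/2] &\le \Pr\left[\|P_{\mathcal{U}} e_i\|^2 \ge (1 + 2\varepsilon)(1/2 - \varepsilon) \text{ for some } i \in [n]\right] \\
&< n \cdot \frac{1}{a_\varepsilon} \frac{1}{(\pi(1/4 - \varepsilon^2))^{1/2}}\, n^{-1/2} e^{-a_\varepsilon r} = \bar{c}\, n^{1/2} e^{-\tilde{c} n}
\end{aligned}$$

for appropriate positive constants $\bar{c}$ and $\tilde{c}$. □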